Determining Differences in Rates
Corresponding to a Given Significance Level


Introduction 


The well-known Chi-square test can be used to detect
significant differences between two proportions. For example,
in Table I we have N patients: (A + B) who received a placebo
and (C + D) who received treatment. The proportion of control
patients who recovered is A/(A + B), and the proportion of
treated patients who recovered is C/(C + D).


             Recovery    No Recovery    Totals

Controls        A             B         A + B
Treated         C             D         C + D

Totals        A + C         B + D       N = A + B + C + D

                         Table I


To determine whether there is a statistically significant
difference between these proportions, one uses the Chi-square
test formula for 2 × 2 tables. For moderate to large N, many
statistical computer packages (e.g., SPSS-X) use the formula
with Yates' correction, that is,

$$\chi^2 = \frac{N\left(|AD - BC| - N/2\right)^2}{(A + B)(C + D)(A + C)(B + D)} \tag{1}$$
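
As a minimal sketch of how formula (1) might be evaluated by hand rather than through a package, the function below computes the Yates-corrected statistic from the four cell counts. The function name `yates_chi_square` and the counts in the usage line are illustrative assumptions, not part of the original paper.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]], as in formula (1)."""
    n = a + b + c + d
    # Product of the four marginal totals (the denominator of formula (1)).
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    return numerator / denominator


# Hypothetical counts: 18 of 30 controls and 28 of 34 treated patients recover.
print(round(yates_chi_square(18, 12, 28, 6), 3))
```

A value below 3.84 (the 5% critical value with one degree of freedom) would be read as not significant at the 5% level.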

Let us assume that we have performed a Chi-square test and
do not get a significant result at, say, the 5% level. In other
words, suppose the proportion who recovered among the treated
patients is larger than the proportion who recovered among the
control patients (i.e., C/(C + D) > A/(A + B)), but the
difference in these proportions is not significant at the 5%
level (i.e., $\chi^2 < 3.84$). Now if a result is significant
at the 5% level, we are implying that the probability is less
than 5% that the result is due to chance. One might ask the
following question: Assuming constant A, B, and N, how large
should the recovery rate of the treated be in order to achieve
significance at the 5% level (i.e., $\chi^2 \ge 3.84$)?

Modifying Table I, we arrive at Table II.


             Recovery      No Recovery          Totals

Controls        A               B               A + B
Treated         y          N - (A + B + y)      N - (A + B)

Totals        A + y        N - (A + y)          N

                         Table II


We  now  discuss  how  to  find  the  treated  recovery  rate 


Computing  the  Recovery  Rate 


Applying the Chi-square formula (1) to Table II, we obtain

$$\chi^2 = \frac{N\left\{\left|A[N - (A + B + y)] - By\right| - N/2\right\}^2}{(A + B)[N - (A + B)](A + y)[N - (A + y)]}$$

If we assume that the recovery rate is higher among the treated,
that is,

$$\frac{y}{N - (A + B)} > \frac{A}{A + B},$$

then the quantity inside the absolute value signs is negative.
Consequently $\chi^2$ can be rewritten as

$$\chi^2 = \frac{N\left\{(A + B)y - A[N - (A + B)] - N/2\right\}^2}{(A + B)[N - (A + B)](A + y)[N - (A + y)]}$$

which leads to a quadratic equation in y.


Defining

$$\xi = A + B, \quad \eta = N - \xi, \quad \mu = N - A, \quad \text{and} \quad \sigma = A\eta + N/2,$$

we obtain

$$\alpha y^2 - \beta y + \gamma = 0 \tag{2}$$

where

$$\alpha = N\xi + \eta\chi^2, \quad \beta = 2N\sigma + \eta\chi^2(\mu - A), \quad \text{and} \quad \gamma = N\sigma^2/\xi - \eta\chi^2 A\mu.$$
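
The step from the rewritten Chi-square expression to Eq. (2) can be spelled out as follows. This is a sketch of the algebra under the definitions above; the intermediate lines are reconstructed for illustration and are not quoted from the original.

```latex
% Definitions: xi = A + B, eta = N - xi, mu = N - A, sigma = A*eta + N/2.
% Then (A + B)y - A[N - (A + B)] - N/2 = xi*y - sigma, and N - (A + y) = mu - y.
\begin{align*}
\chi^2 &= \frac{N(\xi y - \sigma)^2}{\xi\,\eta\,(A + y)(\mu - y)} \\
\eta\chi^2 (A + y)(\mu - y) &= \frac{N}{\xi}(\xi y - \sigma)^2 \\
\eta\chi^2\left[A\mu + (\mu - A)y - y^2\right] &= N\xi y^2 - 2N\sigma y + \frac{N\sigma^2}{\xi} \\
0 &= (N\xi + \eta\chi^2)\,y^2 - \left[2N\sigma + \eta\chi^2(\mu - A)\right]y
     + \left(\frac{N\sigma^2}{\xi} - \eta\chi^2 A\mu\right)
\end{align*}
```

Reading off the coefficients of $y^2$, $y$, and the constant term gives the three coefficients of the quadratic.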

Let us assume for the moment that Eq. (2) has real roots.
This will be verified later by showing that the discriminant
of the quadratic is positive.

Now, even for very small levels of significance (p-values)
of, say, .001 and .0001, $\chi^2$ will assume values only as large
as 10.8 and 15.1, respectively. So we can safely assume that
$N > \chi^2$. (The Chi-square test with Yates' correction for 2 × 2
tables is generally not used for tables where N is 20 or less.)
This assumption implies that $\beta > 0$, since $\beta$ can be
rewritten as $2A\eta(N - \chi^2) + N^2 + N\eta\chi^2$. Clearly,
$\alpha > 0$, so that the sum of the roots, $\beta/\alpha > 0$.

In  addition,  if  we  define 

6  =  n  -  5  ,  (3) 

A  +  X 

it is possible to show that the condition $\delta \ge 0$ implies
that $\gamma > 0$. Therefore, under this condition, the product of
the roots $\gamma/\alpha > 0$, and we are assured of two positive
roots. Regardless of the value of $\delta$, we can be certain that
Eq. (2) will have at least one positive root.


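We now show that the discriminant of the quadratic is
positive, so we can be certain that the roots are real. The
discriminant, $\Delta = \beta^2 - 4\alpha\gamma$, can be written
after a straightforward computation (using $\mu + A = N$) as

$$\Delta = (\eta N\chi^2)^2 + \frac{4\eta N^3\chi^2}{\xi}\left[A + \tfrac{1}{2}\right]\left[B - \tfrac{1}{2}\right]$$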

We observe $B \ge 1$. Otherwise, if B = 0 we would have 100%
of the controls recovering. Consequently $\Delta > 0$ and the roots
of Eq. (2) are real. As shown earlier, we can also conclude
that at least one of the roots will be positive. Before
giving an example, we note finally that any positive roots
that are found must satisfy the inequality

$$\frac{A}{A + B} < \frac{y}{N - (A + B)} \le 1, \tag{4}$$

i.e., the recovery rate of the treated must be greater than
the rate for the controls, and not more than 100%.
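
The procedure described so far (form the coefficients of Eq. (2), solve the quadratic, and keep only roots satisfying inequality (4)) might be sketched as below. The function name `admissible_roots` and its argument names are assumptions made for illustration; the formulas are those given in the text.

```python
import math


def admissible_roots(a, b, n, chi2_crit):
    """Roots y of Eq. (2) that satisfy inequality (4).

    a, b      -- controls who did / did not recover (both assumed >= 1)
    n         -- total number of patients (assumed > a + b)
    chi2_crit -- chi-square value for the chosen significance level
    """
    xi = a + b                 # number of controls
    eta = n - xi               # number of treated patients
    mu = n - a
    sigma = a * eta + n / 2
    alpha = n * xi + eta * chi2_crit
    beta = 2 * n * sigma + eta * chi2_crit * (mu - a)
    gamma = n * sigma ** 2 / xi - eta * chi2_crit * a * mu
    disc = beta ** 2 - 4 * alpha * gamma
    if disc < 0:
        raise ValueError("Eq. (2) has no real roots for these inputs")
    root = math.sqrt(disc)
    candidates = [(beta - root) / (2 * alpha), (beta + root) / (2 * alpha)]
    # Inequality (4): control rate < treated rate <= 1.
    return [y for y in candidates if a / xi < y / eta <= 1]


# Illustrative inputs: 18 of 30 controls recover, 64 patients overall, 5% level.
print(admissible_roots(18, 12, 64, 3.84))
```

An empty list would indicate that no recovery count among the treated, up to and including 100%, reaches the requested significance level.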


Illustrative  Example 

Table III is adapted from data that appear in Chinn [1],
which are also discussed in Bliss [2].

              Well      Sea-Sick     Totals

Placebo        18          12          30
Dramamine      y         34 - y        34

Totals       18 + y      46 - y        64

                     Table III

Here A = 18, B = 12, N = 64, $\xi = 30$, $\eta = 34$, $\mu = 46$,
$\mu - A = 28$, $\sigma = 644$, $N\sigma^2/\xi = 884770.13$, and
$\chi^2 = 3.84$ (i.e., corresponding to the 5% level of
significance). Equation (2) then becomes

$$2050.56y^2 - 86087.68y + 776666.45 = 0$$


Using Eq. (3) we find that $\delta = 25.68$, so we expect two
positive roots. Solving the quadratic, we obtain y = 28.86 and
13.12. Since inequality (4) must be satisfied, we reject the root
y = 13.12 (a treated recovery rate of 13.12/34, or about 0.39,
which is below the control rate of 18/30 = 0.60) and accept the
root y = 28.86. In actuality, our solution must be in integers,
and we obtain [y] + 1 = 29.

Thus, in order to show a statistically significant difference
in the rates at the 5% level, at least 29 of the 34 treated
patients must recover (i.e., stay well).
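
The integer answer can be cross-checked without the quadratic by evaluating formula (1) directly for each possible count of treated recoveries. The sketch below is one way to do that; the function name `smallest_significant_count` is an assumption for illustration, and the search assumes A and B are both at least 1 so that no marginal total is zero.

```python
def smallest_significant_count(a, b, n, chi2_crit):
    """Smallest number y of treated recoveries giving chi-square >= chi2_crit.

    Only counts whose recovery rate exceeds the control rate a / (a + b)
    are considered. Returns None if no such count reaches significance.
    """
    controls = a + b
    treated = n - controls
    for y in range(treated + 1):
        # Skip counts where the treated rate does not exceed the control rate.
        if y * controls <= a * treated:
            continue
        d = treated - y  # treated patients who did not recover
        numerator = n * (abs(a * d - b * y) - n / 2) ** 2
        denominator = controls * treated * (a + y) * (b + d)
        if numerator / denominator >= chi2_crit:
            return y
    return None


# Counts from Table III: 18 well and 12 sea-sick on placebo, 64 subjects in all.
print(smallest_significant_count(18, 12, 64, 3.84))
```

For the Table III counts this direct search agrees with the value of 29 obtained from the quadratic.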

Conclusions 

The problem discussed in this paper arises when one does
not get a significant result when comparing two proportions
(rates) in a 2 × 2 contingency table. The research worker might
be interested in determining how large a difference in rates is
required (assuming one rate is held fixed) in order to show
statistical significance at a given level. This paper provides
a method for computing the required difference for an arbitrary
level of statistical significance. An illustrative example
applying the method is given, based on data appearing in the
literature.